Author

Nathan Herling

Published

June 13, 2025

0 - Setup

[FYI]
'pacman' already installed — skipping install.
[FYI]
'dsbox' already installed — skipping GitHub install.
The packages loaded:
* tidyverse           * glue                * scales              * lubridate            
* patchwork           * ggh4x               * ggrepel             * openintro            
* ggridges            * dsbox               * janitor             * here                 
* knitr               * ggthemes            * ggplot2             * kableExtra           
* palmerpenguins      * grid                * htmltools           * plotly               
* ggforce                                                                                

1 - A new day, a new plot, a new geom

Question #1

A new day, a new plot, a new geom. The goal of this exercise is to learn about a new type of plot (ridgeline plot) and to learn how to make it. Use the geom_density_ridges() function from the ggridges package to make a ridge plot of Airbnb review scores of Edinburgh neighborhoods. The neighborhoods should be ordered by their median review scores. The data can be found in the dsbox package, and it’s called edibnb. Also include an interpretation for your visualization. You should review feedback from your Homework 1 to make sure you capture anything you may have missed previously.

Data Analysis - Q1
Table 1. Diagnostic Summary for review_scores_rating (edibnb data set)
Metric Value
Data Type numeric
Min 20
1st Quartile 93
Median 97
Mean 95.0246657029274
3rd Quartile 99
Max 100
Missing Values 2177
IQR 6
Lower Outlier Bound 84
Upper Outlier Bound 108
Outlier Count 576
Code
# =========================================
# Visualization: Ridge Plot by Neighborhood
# =========================================

# - Filter out rows with NA in neighbourhood or review_scores_rating
edibnb_clean <- edibnb |>
  filter(!is.na(neighbourhood), !is.na(review_scores_rating))

# - Calculate median review scores by neighborhood
median_scores <- edibnb_clean |>
  group_by(neighbourhood) |>
  summarize(median_score = median(review_scores_rating, na.rm = TRUE)) |>
  arrange(median_score)

# - Reorder neighborhoods by median score
edibnb_clean <- edibnb_clean |>
  mutate(neighbourhood = factor(neighbourhood, levels = median_scores$neighbourhood))

# ==== New: Calculate mean review scores for each neighborhood (for annotation) ====
mean_scores <- edibnb_clean |>
  group_by(neighbourhood) |>
  summarize(mean_score = mean(review_scores_rating, na.rm = TRUE))

# ==== Modified plot: add geom_text for mean scores offset to the right ====

# - Get the first and last neighborhood (lowest & highest median)
low_neigh <- levels(edibnb_clean$neighbourhood)[1]
high_neigh <- levels(edibnb_clean$neighbourhood)[length(levels(edibnb_clean$neighbourhood))]

# - Maximum x-position for annotation
max_annot_x <- max(mean_scores$mean_score, na.rm = TRUE) + 25

# Create a named vector of colors for neighborhoods - color for the envelopes.
neighborhood_colors <- setNames(viridis::viridis(length(levels(edibnb_clean$neighbourhood))), levels(edibnb_clean$neighbourhood))


#===============
# Make the plot
#===============
g1 <- ggplot(edibnb_clean, aes(x = review_scores_rating, y = neighbourhood)) +
  geom_density_ridges(         # - use geom_density_ridges 
    aes(color = neighbourhood),
    fill = "cornsilk4",       # - fill in the area under the curve
    alpha = 0.7,
    scale = 2.0,              # - this makes the ridges 'larger' on the plotted area.
    bandwidth = 1,  # you can adjust this value as needed
    show.legend = FALSE
  ) +
  geom_point(
    data = mean_scores,
    aes(x = mean_score, y = neighbourhood, color = "Mean Score"),
    size = 3,
    shape = 18
  ) +
  geom_segment(
    data = mean_scores,
    aes(
      x = mean_score,
      xend = mean_score + 20,
      y = neighbourhood,
      yend = as.numeric(neighbourhood) + 0.5
    ),
    color = "black",
    linewidth = 0.3
  ) +
  ggrepel::geom_text_repel(
    data = mean_scores,
    aes(
      x = mean_score + 20,
      y = as.numeric(neighbourhood) + 0.5,
      label = round(mean_score, 1)
    ),
    hjust = 0,
    size = 3.2,
    color = "black",
    nudge_y = 0.15,
    segment.color = NA,
    direction = "y",
    box.padding = 0.3,
    point.padding = 0.5,
    show.legend = FALSE
  ) +
  annotate(             # - put an annotation next to the 'low' value
    "text",
    x = max_annot_x,
    y = 1.5,
    label = "Low",
    hjust = -1,
    size = 5,
    color = neighborhood_colors[[low_neigh]]
  ) +
  annotate(             # - put an annotation next to the 'high' value
    "text",
    x = max_annot_x,
    y = length(levels(edibnb_clean$neighbourhood)) + 0.5,
    label = "High",
    hjust = -1,
    size = 5,
    color = neighborhood_colors[[high_neigh]]
  ) +
  scale_x_continuous(name = "Review Scores Rating", limits = c(0, 150)) +
  scale_color_manual(       # - adds color to the curve envelopes..
    name = "Legend",
    values = c(neighborhood_colors, "Mean Score" = "black"),
    breaks = "Mean Score",
    labels = "Mean Score"
  ) +
  labs(
    title = "Distribution of Airbnb Review Scores by Edinburgh Neighborhood\n", 
    y = "Neighborhood",
    caption = "Source: Inside Airbnb via Kaggle (data from June 2019)") +
  theme_ridges() +
  theme(
    legend.position = "right",
    legend.title = element_text(hjust = 0.6),
    axis.title.y = element_text(vjust = 1.5, hjust = 0.5),
    plot.title = element_text(hjust = 0.5),
    axis.title.x = element_text(hjust = 0.5),
    plot.caption = element_text(              # - stylizing the caption
      hjust = 0.5,    # right align
      color = "grey50",  # lighter grey
      size = 8
     )
  ) 

# - show the graph
g1


Interpretation
The graph (Distribution of Airbnb Review Scores by Edinburgh Neighborhood) displays the distribution of Airbnb review scores across Edinburgh neighborhoods using ridgeline plots, with each neighborhood’s mean score marked by a diamond (via double encoding). The mean review scores are generally high, ranging from about 93.9 to 95.9, on a scale of 0-100. Some neighborhoods, like Morningside and Bruntsfield, show slightly higher average scores. The variation in score spread highlights differences in review consistency between neighborhoods, making it easier to compare where listings tend to receive better feedback.

2 - Foreign Connected PACs

Question #2a


Make a graph: Contributions to US political parties from UK-connected PACs.

Data Analysis - Q2
Table 1. Diagnostic Summary for dems and repubs
Variable Metric Value
dems Data Type numeric
dems Min -9050
dems 1st Quartile 2000
dems Median 11000
dems Mean 35667.7589807853
dems 3rd Quartile 40000
dems Max 853223
dems Missing Values 0
dems IQR 38000
dems Lower Outlier Bound -55000
dems Upper Outlier Bound 97000
dems Outlier Count 245
repubs Data Type numeric
repubs Min -11000
repubs 1st Quartile 3000
repubs Median 18500
repubs Mean 50162.7623224729
repubs 3rd Quartile 58000
repubs Max 812500
repubs Missing Values 0
repubs IQR 55000
repubs Lower Outlier Bound -79500
repubs Upper Outlier Bound 140500
repubs Outlier Count 236
Code
# ====================================
# Data Wrangle - Pivot and Summarize
# ===================================

# - Pivot data to longer format: create 'party' and 'amount' columns
pac_long <- pac |>
  pivot_longer(
    cols = c(dems, repubs),
    names_to = "party",
    values_to = "amount"
  ) |>
  
  # - Replace party values with readable labels
  mutate(party = recode(party,
                        "dems" = "Democrat",
                        "repubs" = "Republican"))

# - Filter for UK-origin PACs and remove NA values in amount
yearly_totals <- pac_long |>
  filter(country == "UK", !is.na(amount)) |>
  
  # - Summarize total contributions by year and party
  group_by(year, party) |>
  summarise(amount = sum(amount), .groups = "drop")


# ==================
# Create the Plot
# ==================

g2a <- ggplot(yearly_totals, aes(x = year, y = amount, color = party)) +
  geom_line(linewidth = 1.2) +  # Keep line only

  scale_y_continuous(
    name = "Total amount",
    labels = label_dollar(scale = 1e-6, suffix = "M")
  ) +

  scale_x_continuous(
    name = "Year",
    breaks = scales::pretty_breaks()
  ) +

  scale_color_manual(
    values = c("Democrat" = "blue", "Republican" = "red"),
    labels = c("Democrat", "Republican"),
    name = "Party"
  ) +

  labs(
    title = "Contributions to US political parties from UK-connected PACs",
    caption = "Source: OpenSecrets.org"
    ) +

  theme(
    legend.position.inside = c(0.9, 0.15),    # - This may cause clipping warnings
    axis.title.x = element_text(hjust = 0),   # - Align X-axis label left
    axis.title.y = element_text(hjust = 0),   # - Align Y-axis label left
    plot.caption = element_text(hjust = 1, size = 12)
  )

# - Show the plot
g2a

Question #2b


Make a graph: Contributions to US political parties from non-UK-connected PACs.
Let’s pick Switzerland.

Code
# ==================================
# Data Wrangle - Pivot and Summarize
# ==================================
pac_long <- pac |>
  pivot_longer(
    cols = c(dems, repubs),
    names_to = "party",
    values_to = "amount"
  ) |>
  mutate(party = recode(party,
                        "dems" = "Democrat",
                        "repubs" = "Republican"))

yearly_totals <- pac_long |>
  filter(country == "Switzerland", !is.na(amount)) |>
  group_by(year, party) |>
  summarise(amount = sum(amount), .groups = "drop")

# ==================
# Create the Plot
# ==================

g2a <- ggplot(yearly_totals, aes(x = year, y = amount, color = party)) +
  geom_line(linewidth = 1.2) +
  scale_y_continuous(
    name = "Total amount",
    labels = label_dollar(scale = 1e-6, suffix = "M")
  ) +
  scale_x_continuous(
    name = "Year",
    breaks = scales::pretty_breaks()
  ) +
  scale_color_manual(
    values = c("Democrat" = "blue", "Republican" = "red"),
    labels = c("Democrat", "Republican"),
    name = "Party"
  ) +
  labs(title = "Contributions to US political parties from Switzerland-connected PACs") +
  theme(
    legend.position.inside = c(0.9, 0.15),
    axis.title.x = element_text(hjust = 0),
    axis.title.y = element_text(hjust = 0)
  )

# - Show the plot
# print(g2a)
g2a

3 - Median housing prices in the US

Question #3a


Re-create the graph: Median Housing Prices in the US - not seasonally adjusted

Data Analysis - Q3
Combined Diagnostic Summary for Median Housing and Recession Data
Dataset Metric Value
median_housing date - Data Type Date
median_housing price - Data Type numeric
median_housing date - Missing Values 0
median_housing price - Missing Values 0
median_housing price - Min 17800
median_housing price - 1st Quartile 49575
median_housing price - Median 124350
median_housing price - Mean 140386.752136752
median_housing price - 3rd Quartile 223350
median_housing price - Max 374900
median_housing date - Range Start 1963-01-01
median_housing date - Range End 2021-04-01
recessions start - Data Type Date
recessions end - Data Type Date
recessions start - Missing Values 0
recessions end - Missing Values 0
recessions start - Range Start 1857-06-01
recessions start - Range End 2020-02-01
recessions end - Range Start 1858-12-01
recessions end - Range End 2020-04-01
Code
# Load the median housing data
# median_housing <- read_csv("data/median-housing.csv") |>
#   rename(date = DATE, price = MSPUS) |>
#   mutate(date = ymd(date))  #
# 
# # - Load the recession data
# recessions <- read_csv("data/recessions.csv") |>
#   rename(start = Peak, end = Trough) |>
#   mutate(start = ymd(start), end = ymd(end))  # mutate to y-m-d format

# Custom y-axis breaks (extend past 400,000 to ensure it shows)
y_breaks <- seq(0, 440000, by = 40000)

# ADDED: Custom x-axis breaks (exclude endpoints)
x_breaks <- seq(1965, 2020, by = 5) |> paste0("-01-01") |> ymd()

# Create the visualization
g3a <- ggplot(median_housing, aes(x = date, y = price)) +
  geom_line(color = "#4d72e3") +
  
  # - REMOVED: Recession shading (geom_rect layer)
  
  # - REPLACED: Restrict x-axis and apply custom breaks
  scale_x_date(
    breaks = x_breaks,
    limits = c(ymd("1963-01-01"), ymd("2021-04-01")), # x-limits
    date_labels = "%Y"                                # format to '2000'
  ) +
  
  # Custom y-axis breaks and formatted labels
  scale_y_continuous(
    breaks = y_breaks, 
    labels = comma_format(),
    limits = c(0, 400000),     # y-limits
    expand = c(0, 0)
  ) +
  
  # Axis labels
  labs(
    title = "Median sales price of houses sold in the United States\nNot seasonally adjusted",
    y = "Dollars",
    x = NULL,
    # - Source annotation
    caption = "Sources: Census; HUD"
  ) +
  
  theme_minimal() +
  
  # - Right-align the caption
  theme(
    
    plot.title = element_text(
      hjust = 0,
      margin = margin(b=15,l=-50,r=0)   # - control margins of plot title.
      ),           # Align left
    plot.caption = element_text(hjust = 1, size = 9, face = "italic"),
    plot.margin = margin(t = 5, r = 0, b = 10, l = 30),  # Increase left margin to push plot body right

    panel.grid.major.x = element_blank(),  # Remove major vertical grid lines
    panel.grid.minor.x = element_blank(),  # Remove minor vertical grid lines
    panel.grid.major.y = element_line(),   # Keep major horizontal grid lines
    panel.grid.minor.y = element_blank()   # Remove minor horizontal grid lines
    )

plot(g3a)

Question #3b


• Identify recessions that happened during the time frame of the median_housing dataset. Do this by adding a new variable to recessions that takes the value TRUE if the recession happened during this time frame and FALSE if not.
• Now recreate the following visualization. The shaded areas are recessions that happened during the time frame of the median_housing dataset. Hint: The shaded areas are “behind” the line.

Code
# Load the median housing data - assumed already loaded elsewhere
# median_housing <- read_csv("data/median-housing.csv") |> 
#   rename(date = DATE, price = MSPUS) |> 
#   mutate(date = ymd(date)) 

# Load the recession data
recessions <- read_csv("data/recessions.csv") |> 
  rename(start = Peak, end = Trough) |> 
  mutate(start = ymd(start), end = ymd(end)) 

# Custom y-axis breaks (extend past 400,000 to ensure it shows)
y_breaks <- seq(0, 440000, by = 40000)

# Custom x-axis breaks (exclude endpoints)
x_breaks <- seq(1965, 2020, by = 5) |> 
  paste0("-01-01") |> 
  ymd()

# Create the visualization
g3b <- ggplot(median_housing, aes(x = date, y = price)) +
  geom_line(color = "#4d72e3") +

  # Recession shading
  geom_rect(
    data = recessions,
    aes(xmin = start, xmax = end, ymin = -Inf, ymax = Inf),
    inherit.aes = FALSE,
    fill = "gray",
    alpha = 0.3
  ) +

  # Restrict x-axis and apply custom breaks
  scale_x_date(
    breaks = x_breaks,
    limits = c(ymd("1961-01-01"), ymd("2025-01-01")),
    date_labels = "%Y"
  ) +

  # Custom y-axis breaks and formatted labels
  scale_y_continuous(
    breaks = y_breaks,
    labels = comma_format(),
    limits = c(0, 400000),
    expand = c(0, 0)
  ) +

  # Axis labels
  labs(
    title = "Median sales price of houses sold in the United States\nNot seasonally adjusted",
    y = "Dollars",
    x = "Shaded areas indicate U.S. recessions",
    caption = "Source: Census; HUD"
  ) +

  theme_minimal() +

  # Customize theme for captions, margins, and grids
  theme(
    plot.caption = element_text(hjust = 1, size = 9, face = "italic"),
    plot.title = element_text(
      hjust = 0,
      margin = margin(b = 15, l = -50, r = 0)
    ),
    plot.margin = margin(t = 5, r = 0, b = 10, l = 30),

    axis.title.x = element_text(
    hjust = 1,                # right-align the x-axis label
    margin = margin(t = 10)   # optional: add some space above the label
    ),
    
    # Grid lines
    panel.grid.major.x = element_blank(),
    panel.grid.minor.x = element_blank(),
    panel.grid.major.y = element_line(),
    panel.grid.minor.y = element_blank()
  )

# Print plot without warnings
suppressWarnings(print(g3b))

3b-Note: Some recession rows were intentionally excluded for the purpose of the assignment.

Question #3c


• Create a subset of median_housing dataset from 2019 and 2020. Add two columns: year and quarter. year is the year of the date and the quarter takes the values of Q1, Q2, Q3, or Q4 based on date
• Re-create the visualization.

Code
# - Load the data (already loadedNo)
# median_housing <- read_csv("data/median-housing.csv") |>
#   rename(date = DATE, price = MSPUS) |>
#   mutate(date = ymd(date))

# - Filter to 2019 and 2020 only, keeping quarterly points
quarter_dates <- seq(ymd("2019-01-01"), ymd("2020-10-01"), by = "3 months")
median_housing_q <- median_housing |>
  filter(date %in% quarter_dates)

# - Define quarter labels
quarter_labels <- rep(c("Q1", "Q2", "Q3", "Q4"), times = 2)

# - Define year sub-labels (x-axis) - best I could figure out how to do
year_labels <- c("", sprintf("%25s", "2019"), "", "",
                 "", sprintf("%25s", "2020"), "", "")

# - Combine into single x-axis label with optional sub-labels
combined_labels <- ifelse(
  year_labels == "",
  quarter_labels,
  paste0(quarter_labels, "\n", year_labels)
)

#===============
# - Plot data
#===============

g3c <- ggplot(median_housing_q, aes(x = date, y = price)) +
  
  # --- Change 1 & 2: Thicken the line and make sure it does not run through points ---
  geom_line(color = "#4d72e3", linewidth = 1, lineend = "round") +  # Slightly thicker line
  
  # --- Change 1: Make circles look 'empty' by drawing points with stroke and fill white ---
  geom_point(shape = 21, size = 2, color = "#4d72e3", fill = "white", stroke = 1.2) +  # Empty circles
  
  # --- Change 3: Limit line to end exactly at Q4 by limiting x-axis ---
scale_x_date(
  breaks = quarter_dates,
  labels = combined_labels,
  limits = c(ymd("2019-01-01"), ymd("2020-10-01")),  # Already correct range
  expand = c(0.02, 0.02)  # <<< CHANGE: Prevent extension beyond last Q4 tick
) +
  
  scale_y_continuous(
    breaks = seq(300000, 360000, by = 20000),
    labels = comma_format(),
    limits = c(300000, 360000),
    expand = c(0, 0)
  ) +
  
  labs(
    title = "Median sales price of houses sold in the United States\nNot seasonally adjusted",
    x = NULL,
    y = "Dollars"
  ) +
  
  theme_minimal() +
  theme(
    plot.title = element_text(
      hjust = 0,
      margin = margin(b=15,l=-50,r=0)   # Control margins of plot title.
    ),           # Align left
    plot.caption = element_text(hjust = 1, size = 9, face = "italic"),
    plot.margin = margin(t = 5, r = 0, b = 40, l = 30),  # Increase left margin to push plot body right

    #panel.grid.major.x = element_blank(),  # Remove major vertical grid lines
    panel.grid.minor.x = element_blank(),  # Remove minor vertical grid lines
    panel.grid.major.y = element_line(),   # Keep major horizontal grid lines
    #panel.grid.minor.y = element_blank(),   # Remove minor horizontal grid lines 
    
  ) +
  # --- CHANGE: Use annotation_custom to add year labels below x-axis ---
# Add year labels between Q2 and Q3 using annotate()
annotate("text", x = ymd("2019-05-16"), y = 298000, label = "2019", size = 3.5) +
annotate("text", x = ymd("2020-05-16"), y = 298000, label = "2020", size = 3.5)


# - Plot
suppressWarnings(print(g3c))

3c-Note: Some recession rows were intentionally excluded for the purpose of the assignment.

4 - Expect More. Plot More.

Question #4


Recreate the Target LOGO.
Describe steps..
(see code comments)
1. make inner dot
2. Make outer ring
3. Make ‘Target’ use ‘[R]’ symbol
4. Piece it all together.

Code
# library(ggplot2)
# library(ggforce)

# Define Target's signature red color
target_red <- "#e82118"

# Create a data frame for the concentric circles (outer red ring and inner white ring)
circles <- data.frame(
  x0 = 0, y0 = 0,                    # Center of both circles
  r = c(1, 0.7),                     # Radii for the outer red ring and inner white ring
  fill = c(target_red, "white")     # Fill colors for each ring
)

# Construct the plot
g1 <- ggplot() +
  # Draw the red and white rings (two concentric circles)
  geom_circle(data = circles, aes(x0 = x0, y0 = y0, r = r, fill = fill), color = NA) +
  
  # Draw the solid inner red circle (target bullseye center)
  geom_circle(aes(x0 = 0, y0 = 0, r = 0.3), fill = target_red, color = NA) +
  
  # Add the text "TARGET" beneath the bullseye
  annotate("text", x = 0, y = -1.4, label = "TARGET", color = target_red, size = 10, fontface = "bold") +
  
  # Add the registered trademark symbol next to "TARGET"
  annotate("text", x = 0.6, y = -1.52, label = "®", color = target_red, size = 8) +
  
  # Use fill colors as provided (don't map them to a color scale)
  scale_fill_identity() +
  
  # Ensure equal aspect ratio for x and y (perfect circles)
  coord_fixed() +
  
  # Remove all background, axes, and gridlines
  theme_void()

# Render the plot
plot(g1)

5 - Mirror, mirror on the wall, who’s the ugliest of them all?

Question #5


Mirror, mirror on the wall, who’s the ugliest of them all? Make a plot of the variables in the penguins dataset from the palmerpenguins package. Your plot should use at least two variables, but more is fine too. First, make the plot using the default theme and color scales. Then, update the plot to be as ugly as possible. You will probably want to play around with theme options, colors, fonts, etc. The ultimate goal is the ugliest possible plot, and the sky is the limit!

Code
# Load necessary libraries

# Remove rows with missing values
penguins_clean <- na.omit(penguins)

# Create scatter plot: bill length vs flipper length, colored by species
g5a <- ggplot(data = penguins_clean, aes(x = bill_length_mm, y = flipper_length_mm, color = species)) +
  geom_point() +
  labs(
    title = "Plot #1 - normal plot\nBill Length vs Flipper Length by Species",
    x = "Bill Length (mm)",
    y = "Flipper Length (mm)"
  )

# - show it
g5a

Question #5


The ultimate goal is the ugliest possible plot, and the sky is the limit!
Snakes on a plane? No. Penguins on a Sphere!

Code
# Step 1: Clean the data
penguins_clean <- na.omit(penguins)

# Step 2: Map to spherical coordinates
# - θ (theta): polar angle (bill length) mapped to [0, π]
# - φ (phi): azimuthal angle (flipper length) mapped to [0, 2π]
penguins_sphere <- penguins_clean %>%
  mutate(
    theta = rescale(bill_length_mm, to = c(0, pi)),
    phi = rescale(flipper_length_mm, to = c(0, 2 * pi)),
    r = 1,
    x = r * sin(theta) * cos(phi),
    y = r * sin(theta) * sin(phi),
    z = r * cos(theta)
  )

# Step 3: Build the 3D scatter plot
plot_3d <- plot_ly(
  data = penguins_sphere,
  x = ~x, y = ~y, z = ~z,
  type = 'scatter3d',
  mode = 'markers',
  color = ~species,
  colors = c("Adelie" = "red", "Chinstrap" = "green", "Gentoo" = "blue"),
  marker = list(size = 4)
) %>%
  layout(
    title = list(
      text = "Plot #2 – Penguins on a Sphere",
      font = list(size = 25)
    ),
        margin = list(
      t = 100,  # ← increase top margin here
      r = 30
    ),
    scene = list(
      xaxis = list(title = "X"),
      yaxis = list(title = "Y"),
      zaxis = list(title = "Z")
    ),
    legend = list(title = list(text = "<b>Species</b>"))
  )

# Step 4: Add description as an HTML caption below the plot
tagList(
  plot_3d,
  HTML("
    <div style='margin-top: 1em; font-size: 14px; line-height: 1.4em;'>
      <strong>Mapping Description:</strong><br>
      • <b>Bill Length</b> → θ (polar angle, latitude)<br>
      • <b>Flipper Length</b> → φ (azimuthal angle, longitude)<br>
      • Radius is constant: <b>r = 1</b><br>
      • Penguins are plotted on the <b>surface of a unit sphere</b>
    </div>
  ")
)
Mapping Description:
Bill Length → θ (polar angle, latitude)
Flipper Length → φ (azimuthal angle, longitude)
• Radius is constant: r = 1
• Penguins are plotted on the surface of a unit sphere